Dual Averaging Methods for Regularized Stochastic Learning and Online Optimization
Author
Abstract
We consider regularized stochastic learning and online optimization problems, where the objective function is the sum of two convex terms: one is the loss function of the learning task, and the other is a simple regularization term such as the ℓ1-norm for promoting sparsity. We develop extensions of Nesterov’s dual averaging method that can exploit the regularization structure in an online setting. At each iteration of these methods, the learning variables are adjusted by solving a simple minimization problem that involves the running average of all past subgradients of the loss function and the whole regularization term, not just its subgradient. In the case of ℓ1-regularization, our method is particularly effective in obtaining sparse solutions. We show that these methods achieve the optimal convergence rates or regret bounds that are standard in the literature on stochastic and online convex optimization. For stochastic learning problems in which the loss functions have Lipschitz continuous gradients, we also present an accelerated version of the dual averaging method.
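To make the per-iteration step concrete, below is a minimal Python sketch of an ℓ1-regularized dual-averaging update. It assumes a common parameterization with a quadratic auxiliary strongly convex term weighted by γ/√t, under which minimizing the averaged subgradient term plus the whole ℓ1 term has a soft-thresholding closed form; the function name l1_rda_step, the toy logistic-loss stream, and the constants λ and γ are illustrative choices not taken from the abstract.

```python
import numpy as np

def l1_rda_step(g_bar, t, lam, gamma):
    """One dual-averaging step for an l1-regularized objective (sketch).

    g_bar : running average of all past loss subgradients, shape (d,)
    t     : iteration counter, t >= 1
    lam   : weight of the l1 regularization term
    gamma : constant for the sqrt(t)-weighted quadratic auxiliary term

    Minimizing <g_bar, w> + lam*||w||_1 + (gamma/sqrt(t)) * ||w||^2 / 2
    gives the soft-thresholding closed form below (assumed
    parameterization, not spelled out in the abstract).
    """
    shrunk = np.sign(g_bar) * np.maximum(np.abs(g_bar) - lam, 0.0)
    return -(np.sqrt(t) / gamma) * shrunk

# Toy usage: an online logistic-loss stream with a running subgradient average.
rng = np.random.default_rng(0)
d, lam, gamma = 5, 0.1, 1.0
w, g_bar = np.zeros(d), np.zeros(d)
for t in range(1, 101):
    x, y = rng.normal(size=d), rng.choice([-1.0, 1.0])
    g = -y * x / (1.0 + np.exp(y * (x @ w)))  # gradient of log(1 + exp(-y*x.w))
    g_bar += (g - g_bar) / t                  # running average of all past subgradients
    w = l1_rda_step(g_bar, t, lam, gamma)
print(w)  # coordinates with small averaged gradient are exactly zero
```

Because the entire ℓ1 term enters each minimization (rather than only its subgradient), coordinates whose averaged subgradient stays below λ are set exactly to zero, which is the mechanism behind the sparsity the abstract refers to.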
Similar references
Dual Averaging Method for Regularized Stochastic Learning and Online Optimization
We consider regularized stochastic learning and online optimization problems, where the objective function is the sum of two convex terms: one is the loss function of the learning task, and the other is a simple regularization term such as l1-norm for promoting sparsity. We develop a new online algorithm, the regularized dual averaging (RDA) method, that can explicitly exploit the regularizatio...
Dual Averaging and Proximal Gradient Descent for Online Alternating Direction Multiplier Method
We develop new stochastic optimization methods that are applicable to a wide range of structured regularizations. Basically our methods are combinations of basic stochastic optimization techniques and Alternating Direction Multiplier Method (ADMM). ADMM is a general framework for optimizing a composite function, and has a wide range of applications. We propose two types of online variants of AD...
Stochastic dual averaging methods using variance reduction techniques for regularized empirical risk minimization problems
We consider a composite convex minimization problem associated with regularized empirical risk minimization, which often arises in machine learning. We propose two new stochastic gradient methods that are based on stochastic dual averaging method with variance reduction. Our methods generate a sparser solution than the existing methods because we do not need to take the average of the history o...
Randomized Block Subgradient Methods for Convex Nonsmooth and Stochastic Optimization
Block coordinate descent methods and stochastic subgradient methods have been extensively studied in optimization and machine learning. By combining randomized block sampling with stochastic subgradient methods based on dual averaging ([22, 36]), we present stochastic block dual averaging (SBDA)—a novel class of block subgradient methods for convex nonsmooth and stochastic optimization. SBDA re...
Reweighted l1 Dual Averaging Approach for Sparse Stochastic Learning
Recent advances in stochastic optimization and regularized dual averaging approaches revealed a substantial interest for a simple and scalable stochastic method which is tailored to some more specific needs. Among the latest one can find sparse signal recovery and l0-based sparsity inducing approaches. These methods in particular can force many components of the solution to shrink to zero, thus cla...
Journal:
Journal of Machine Learning Research
Volume 11, Issue -
Pages -
Publication year: 2010